The theory of island biogeography 1 asserts that an island or a local
community approaches an equilibrium species richness as a result
of the interplay between the immigration of species from the much
larger metacommunity source area and local extinction of species 
on the island (local community). Hubbell 2 generalized this neutral 
theory to explore the expected steady-state distribution of relative 
species abundance (RSA) in the local community under restricted 
immigration. Here we present a theoretical framework for the unified 
neutral theory of biodiversity 2 and an analytical solution for the dis- 
tribution of the RSA both in the metacommunity (Fisher's logseries) 
and in the local community, where there are fewer rare species. Rare 
species are more extinction-prone, and once they go locally extinct, 
they take longer to re-immigrate than do common species. Contrary 
to recent assertions 3 , we show that the analytical solution provides 
a better fit, with fewer free parameters, to the RSA distribution of 
tree species on Barro Colorado Island (BCI) 4 than the lognormal 
distribution. 

The neutral theory in ecology seeks to capture the influence of speciation,
extinction, dispersal, and ecological drift on the RSA under the assumption
that all species are demographically alike on a per capita basis.
This assumption, while only an approximation, appears to provide a
useful description of an ecological community on some spatial and temporal
scales. More significantly, it allows the development of a tractable null
theory for testing hypotheses about community assembly rules. However,
until now, there has been no analytical derivation of the expected equilibrium
distribution of RSA in the local community, and fits to the theory
have required simulations, with associated problems of convergence times,
unspecified stopping rules, and precision.

The dynamics of the population of a given species is governed by generalized
birth and death events (including speciation, immigration and emigration).
Let $b_{n,k}$ and $d_{n,k}$ represent the probabilities of birth and death,
respectively, in the $k$-th species with $n$ individuals, with
$b_{-1,k} = d_{0,k} = 0$. Let $p_{n,k}(t)$ denote the probability that the
$k$-th species contains $n$ individuals at time $t$. In the simplest scenario,
the time evolution of $p_{n,k}(t)$ is regulated by the master equation

$$\frac{dp_{n,k}(t)}{dt} = p_{n+1,k}(t)\,d_{n+1,k} + p_{n-1,k}(t)\,b_{n-1,k} - p_{n,k}(t)\,(b_{n,k} + d_{n,k}) \qquad (1)$$

which leads to the steady-state or equilibrium solution, denoted by $P$:

$$P_{n,k} = P_{0,k} \prod_{i=0}^{n-1} \frac{b_{i,k}}{d_{i+1,k}} \qquad (2)$$

for $n > 0$ and where $P_{0,k}$ can be deduced from the normalization condition
$\sum_n P_{n,k} = 1$. Note that there is no requirement of conservation of community
size. One can show that the system is guaranteed to reach the stationary
solution in the infinite time limit.
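
The product form of the steady-state solution, $P_n = P_0 \prod_{i<n} b_i / d_{i+1}$ followed by normalization, can be illustrated with a minimal sketch. The function name, the callables `birth` and `death`, the truncation at `n_max` and the numerical rate values are assumptions made for illustration only; they are not part of the theory.

```python
def stationary_distribution(birth, death, n_max):
    """Return [P_0, ..., P_n_max] from P_n = P_0 * prod_{i<n} b_i / d_{i+1}.

    birth(i) and death(i) are hypothetical callables giving b_i and d_i.
    The state space is truncated at n_max (an assumption of this sketch).
    """
    weights = [1.0]  # unnormalized weight for n = 0
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * birth(n - 1) / death(n))
    total = sum(weights)  # normalization condition: sum_n P_n = 1
    return [w / total for w in weights]


# Illustrative density-independent rates: b_0 = nu, b_n = b*n, d_n = d*n.
b, d, nu = 0.09, 0.1, 0.001
P = stationary_distribution(
    birth=lambda n: nu if n == 0 else b * n,
    death=lambda n: d * n,
    n_max=400,
)
```

With these rates the ratio `P[n] / P[1]` reduces to $x^{n-1}/n$ with $x = b/d$, which is the logseries shape.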

The frequency of species containing $n$ individuals is given by

$$\phi_n = \sum_{k=1}^{S} I_k \qquad (3)$$

where $S$ is the total number of species and the indicator $I_k$ is a random
variable which takes the value 1 with probability $P_{n,k}$ and 0 with probability
$(1 - P_{n,k})$. Thus the average number of species containing $n$ individuals is
given by

$$\langle\phi_n\rangle = \sum_{k=1}^{S} P_{n,k}. \qquad (4)$$

The RSA relationship we seek to derive is the dependence of $\langle\phi_n\rangle$ on $n$.

Let a community consist of species with $b_{n,k} = b_n$ and $d_{n,k} = d_n$ being
independent of $k$ (the species are assumed to be demographically identical).
From Eq. (4), it follows that $\langle\phi_n\rangle$ is simply proportional to $P_n$, leading to

$$\langle\phi_n\rangle = S P_0 \prod_{i=0}^{n-1} \frac{b_i}{d_{i+1}}. \qquad (5)$$

We consider a metacommunity in which the probability $d$ that an individual
dies and the probability $b$ that an individual gives birth to an offspring
are independent of the population of the species to which it belongs (density
independent case), i.e. $b_n = bn$ and $d_n = dn$ ($n > 0$). Speciation may be
introduced by ascribing a non-zero probability of the appearance of an individual
of a new species, i.e. $b_0 = \nu \neq 0$. Substituting the expressions into
Eq. (5), one obtains the celebrated Fisher logseries:

$$\langle\phi_n^M\rangle = S_M P_0 \frac{b_0 b_1 \cdots b_{n-1}}{d_1 d_2 \cdots d_n} = \theta \frac{x^n}{n}, \qquad (6)$$

where $M$ refers to the metacommunity, $x = b/d$ and $\theta = S_M P_0 \nu / b$ is the
biodiversity parameter (also called Fisher's $\alpha$). We follow the notation of
Hubbell in this paper. Note that $x$ represents the ratio of effective per
capita birth rate to the death rate arising from a variety of causes such
as birth, death, immigration and emigration. Note that in the absence of
speciation, $b_0 = \nu = \theta = 0$, and, in equilibrium, there are no individuals in
the metacommunity. When one introduces speciation, $x$ has to be less than
1 to maintain a finite metacommunity size $J_M = \sum_n n \langle\phi_n^M\rangle = \theta x/(1 - x)$.
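
As a quick numerical illustration of the logseries $\theta x^n/n$ and of the finite metacommunity size for $x < 1$, one can sum the series directly. The values of `theta` and `x` and the truncation `n_max` below are illustrative assumptions, and the species-richness check uses the standard series $\sum_n x^n/n = -\ln(1-x)$, which is not stated in the text.

```python
import math


def logseries(theta, x, n):
    """Expected number of metacommunity species with n individuals: theta * x**n / n."""
    return theta * x**n / n


theta, x = 50.0, 0.99  # illustrative values, not fitted to any data set
n_max = 5000           # truncation; x**n_max is negligible for these values

J_M = sum(n * logseries(theta, x, n) for n in range(1, n_max + 1))
S_M = sum(logseries(theta, x, n) for n in range(1, n_max + 1))

print(J_M, theta * x / (1 - x))        # metacommunity size vs. closed form
print(S_M, -theta * math.log(1 - x))   # species richness vs. -theta * ln(1 - x)
```

As $x$ approaches 1 the closed form $\theta x/(1-x)$ diverges, which is the sense in which $x < 1$ is required.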

We turn now to the case of a local community of size J undergoing births 
and deaths accompanied by a steady immigration of individuals from the sur- 
rounding metacommunity. When the local community is semi-isolated from 
the metacommunity, one may introduce an immigration rate m, which is the 
probability of immigration from the metacommunity to the local community. 
For constant m (independent of species), immigrants belonging to the more 
abundant species in the metacommunity will arrive in the local community 
more frequently than those of rarer species. 

Our central result (see Box 1 for a derivation) is an analytic expression
for the RSA of the local community:

$$\langle\phi_n\rangle = \theta\,\frac{J!}{n!\,(J-n)!}\,\frac{\Gamma(\gamma)}{\Gamma(J+\gamma)} \int_0^{\gamma} \frac{\Gamma(n+y)}{\Gamma(1+y)}\,\frac{\Gamma(J-n+\gamma-y)}{\Gamma(\gamma-y)}\,\exp(-y\theta/\gamma)\,dy \qquad (7)$$

where $\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt$, which is equal to $(z-1)!$ for integer $z$, and
$\gamma = \frac{m(J-1)}{1-m}$. As expected, $\langle\phi_n\rangle$ is zero when $n$ exceeds $J$. The computer
calculations in Hubbell's book as well as those more recently carried out by
McGill were aimed at estimating $\langle\phi_n\rangle$ by simulating the processes of birth,
death and immigration.

One can evaluate the integral in Eq. (7) numerically for a given set of
parameters: $J$, $\theta$ and $m$. For large values of $n$, the integral can be evaluated
very accurately and efficiently using the method of steepest descent. Any
given RSA data set contains information about the local community size,
$J$, and the total number of species in the local community, $S_L = \sum_{k=1}^{J} \langle\phi_k\rangle$.
Thus there is just one free fitting parameter at one's disposal.
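
A minimal sketch of such a numerical evaluation is given below. It uses a plain midpoint rule with log-Gamma functions rather than the steepest-descent method mentioned above, and it takes the integrand of Eq. (7) to be $\frac{\Gamma(n+y)}{\Gamma(1+y)} \frac{\Gamma(J-n+\gamma-y)}{\Gamma(\gamma-y)} \exp(-y\theta/\gamma)$ with $\gamma = m(J-1)/(1-m)$. The function name `phi_local`, the `steps` argument and the parameter values are assumptions made for illustration; the sketch requires $0 < m < 1$.

```python
import math


def phi_local(n, J, theta, m, steps=4000):
    """Midpoint-rule estimate of <phi_n> from Eq. (7); returns 0 for n > J."""
    if n < 1 or n > J:
        return 0.0
    gamma = m * (J - 1) / (1.0 - m)
    # log of theta * J!/(n!(J-n)!) * Gamma(gamma)/Gamma(J+gamma)
    log_pref = (math.log(theta)
                + math.lgamma(J + 1) - math.lgamma(n + 1) - math.lgamma(J - n + 1)
                + math.lgamma(gamma) - math.lgamma(J + gamma))
    h = gamma / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h  # midpoints avoid the endpoints y = 0 and y = gamma
        log_f = (math.lgamma(n + y) - math.lgamma(1.0 + y)
                 + math.lgamma(J - n + gamma - y) - math.lgamma(gamma - y)
                 - y * theta / gamma)
        total += math.exp(log_pref + log_f)
    return total * h


J, theta, m = 60, 6.0, 0.2  # small illustrative values, not the BCI parameters
S_L = sum(phi_local(n, J, theta, m) for n in range(1, J + 1))
print(S_L)  # expected number of species in the local community
```

Working with logarithms of the Gamma functions keeps the individual factors from overflowing when $J$ is large; for a community the size of the BCI plot a faster quadrature would be preferable to this brute-force loop.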

McGill asserted that the lognormal distribution is a more parsimonious
null hypothesis than the neutral theory, a suggestion which is not borne out
by our reanalysis of the BCI data. We focus only on the BCI data set because,
as pointed out by McGill, the North American Breeding Bird Survey
data are not as exhaustively sampled as the BCI data set, resulting in fewer
individuals and species in any given year in a given location. Furthermore,
McGill's analysis seems to rely on adding the bird counts of 5 years at the
same sampling locations even though these data sets are not independent.

Figure 1 shows a Preston-like binning of the BCI data and the fit of our
analytic expression with one free parameter (11 degrees of freedom) along
with a lognormal having three free parameters (9 degrees of freedom).
Standard chi-square analysis yields values of $\chi^2 = 3.20$ for the neutral theory
and 3.89 for the lognormal. The probabilities of such good agreement arising
by chance are 1.23% and 8.14% for the neutral theory and lognormal fits,
respectively. Thus one obtains a better fit of the BCI data with the analytical
solution to the neutral theory than with the lognormal, even though
there are two fewer free parameters. McGill's analysis of the BCI data set
was based on computer simulations in which there were difficulties in knowing
when to stop the simulations, i.e. when equilibrium had been reached. It
is unclear whether McGill averaged over an ensemble of runs, which is essential
to obtain repeatable and reliable results from simulations of stochastic
processes because of their inherent noisiness. However, simulations of the
neutral theory are no longer necessary, and all problems with simulations are
moot, because an analytical solution is now available.
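
The quoted probabilities can be read as lower-tail chi-square probabilities, i.e. the chance of obtaining a $\chi^2$ value at least as small as the observed one. The sketch below evaluates that tail with the power series of the regularized lower incomplete gamma function; the function name `chi2_cdf` and the number of series terms are assumptions of this illustration, and the interpretation as a lower tail is inferred from the numbers rather than stated in the text.

```python
import math


def chi2_cdf(chi2, dof, terms=200):
    """Lower-tail probability P(X <= chi2) for a chi-square variable with dof degrees of freedom.

    Uses the series P(a, x) = x**a * exp(-x) / Gamma(a + 1)
                              * sum_k x**k / ((a + 1) ... (a + k)),
    with a = dof / 2 and x = chi2 / 2.
    """
    a, x = dof / 2.0, chi2 / 2.0
    if x <= 0.0:
        return 0.0
    term, series = 1.0, 1.0
    for k in range(1, terms):
        term *= x / (a + k)
        series += term
    return math.exp(a * math.log(x) - x - math.lgamma(a + 1.0)) * series


print(chi2_cdf(3.20, 11))  # neutral theory fit, 11 degrees of freedom
print(chi2_cdf(3.89, 9))   # lognormal fit, 9 degrees of freedom
```

Under this reading the two values come out close to the 1.23% and 8.14% quoted above, with small differences attributable to the rounding of the $\chi^2$ values.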

The lognormal distribution is biologically less informative and mathematically
less acceptable as a dynamical null hypothesis for the distribution of
RSA than the neutral theory. The parameters of the neutral theory of RSA
are directly interpretable in terms of birth and death rates, immigration
rates, size of the metacommunity, and speciation rates. A dynamical model
of a community cannot yield a lognormal distribution with finite variance
because in its time evolution, the variance increases through time without
bound. However, as shown by Sugihara et al., the lognormal distribution
can arise in static models, such as those based on niche hierarchy.

The steady-state deficit in the number of rare species compared to that 
expected under the logseries can also occur because rare species grow differ- 
entially faster than common species and therefore move up and out of the 
rarest abundance categories due to their rare species advantage. Indeed, it
is likely that several different models (e.g. an empirical lognormal distribution,
niche hierarchy models or the theory presented here) might provide
comparable fits to the RSA data (we have found that the lognormal does
slightly better than the neutral theory for the Pasoh data set, a tropical
tree community in Malaysia). Such fitting exercises in and of themselves, 
however, do not constitute an adequate test of the underlying theory. Neu- 
tral theory predicts that the degree of skewing of the RSA distribution ought 
to increase as the rate of immigration into the local community decreases. 
Dynamic data on rates of birth, death, dispersal and immigration are needed 
to evaluate the assumptions of neutral theory and determine the role played 
by niche differentiation in the assembly of ecological communities. 

Our analysis should also apply to the field of population genetics in which
the mutation-extinction equilibrium of neutral allele frequencies at a given
locus has been studied for several decades.

Box 1

Derivation of the RSA of the local community 

We study the dynamics within a local community following the mathematical
framework of McKane et al., who studied a mean-field stochastic
model for species-rich assembled communities. In our context, the dynamical
rules governing the stochastic processes in the community are:

1) With probability 1 - m, pick two individuals at random from the local
community. If they belong to the same species, no action is taken. Otherwise,
with equal probability, replace one of the individuals with the offspring of 
the other. In other words, the two individuals serve as candidates for death 
and parenthood. 

2) With probability m, pick one individual at random from the local com- 
munity. Replace it by a new individual chosen with a probability proportional 
to the abundance of its species in the metacommunity. This corresponds to 
the death of the chosen individual in the local community followed by the 
arrival of an immigrant from the metacommunity. Note that the sole mech- 
anism for replenishing species in the local community is immigration from 
the metacommunity, which for the purposes of local community dynamics 
is treated as a permanent source pool of species, as in the theory of island
biogeography.
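
Although the analytical solution makes simulation unnecessary for fitting, the two rules above are easy to state as a single update step, which may help make them concrete. The sketch below is one possible reading of Rules 1 and 2; the function name `step`, the representation of the community as a list of species labels, and the metacommunity abundances are all hypothetical choices made for illustration.

```python
import random


def step(local, meta_abundance, m, rng):
    """Apply one update of Rules 1 and 2 to `local`, a list with one species label per individual."""
    J = len(local)
    if rng.random() < m:
        # Rule 2: a random local individual dies and is replaced by an immigrant
        # drawn with probability proportional to metacommunity abundance.
        species = list(meta_abundance)
        weights = [meta_abundance[s] for s in species]
        local[rng.randrange(J)] = rng.choices(species, weights=weights, k=1)[0]
    else:
        # Rule 1: pick two distinct individuals; if their species differ,
        # one of them (chosen with equal probability) is replaced by the
        # offspring of the other.
        i, j = rng.sample(range(J), 2)
        if local[i] != local[j]:
            if rng.random() < 0.5:
                i, j = j, i
            local[i] = local[j]


rng = random.Random(0)
meta = {"A": 600, "B": 300, "C": 100}  # hypothetical metacommunity abundances
local = ["A"] * 20 + ["B"] * 20 + ["C"] * 10
for _ in range(20_000):
    step(local, meta, m=0.1, rng=rng)
```

Every update replaces exactly one individual, so the local community size $J$ is conserved by construction, and a species lost from `local` can only return through the immigration branch.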



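These rules are encapsulated in the following expressions for effective
birth and death rates for the $k$-th species:

$$b_{n,k} = (1-m)\,\frac{n}{J}\,\frac{J-n}{J-1} + m\,\frac{\mu_k}{J_M}\left(1 - \frac{n}{J}\right) \qquad (8)$$

$$d_{n,k} = (1-m)\,\frac{n}{J}\,\frac{J-n}{J-1} + m\left(1 - \frac{\mu_k}{J_M}\right)\frac{n}{J} \qquad (9)$$

where $\mu_k$ is the abundance of the $k$-th species in the metacommunity and $J_M$
is the total population of the metacommunity.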

The right hand side of Eq. (8) consists of two terms. The first corresponds
to Rule (1) with a birth in the $k$-th species accompanied by a death elsewhere
in the local community. The second term accounts for an increase of the
population of the $k$-th species due to immigration from the metacommunity.
The immigration is, of course, proportional to the relative abundance $\mu_k/J_M$
of the $k$-th species in the metacommunity. Eq. (9) follows in a similar manner.
Note that $b_{n,k}$ and $d_{n,k}$ not only depend on the species label $k$ but also are
no longer simply proportional to $n$.

Substituting Eq. (8) and Eq. (9) into Eq. (2), one obtains the expression

$$P_{n,k} = \frac{J!}{n!\,(J-n)!}\,\frac{\Gamma(n+\lambda_k)}{\Gamma(\lambda_k)}\,\frac{\Gamma(\vartheta_k-n)}{\Gamma(\vartheta_k-J)}\,\frac{\Gamma(\lambda_k+\vartheta_k-J)}{\Gamma(\lambda_k+\vartheta_k)} \equiv F(\mu_k), \qquad (10)$$

where

$$\lambda_k = \frac{m}{1-m}\,(J-1)\,\frac{\mu_k}{J_M} \qquad (11)$$

and

$$\vartheta_k = J + \frac{m}{1-m}\,(J-1)\left(1 - \frac{\mu_k}{J_M}\right). \qquad (12)$$
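
The closed form can be checked numerically against the product formula $P_n = P_0 \prod_{i<n} b_i/d_{i+1}$. The sketch below assumes the effective rates implied by Rules 1 and 2, namely $b_n = (1-m)\frac{n}{J}\frac{J-n}{J-1} + m\,r\,(1 - \frac{n}{J})$ and $d_n = (1-m)\frac{n}{J}\frac{J-n}{J-1} + m\,(1-r)\frac{n}{J}$ with $r = \mu_k/J_M$, together with $\lambda = \frac{m}{1-m}(J-1)\,r$ and $\vartheta = J + \frac{m}{1-m}(J-1)(1-r)$. The function names and the parameter values are illustrative assumptions.

```python
import math


def rates(n, J, m, rel):
    """Effective birth and death rates b_n, d_n for a species of metacommunity relative abundance rel."""
    drift = (1.0 - m) * (n / J) * ((J - n) / (J - 1))
    b = drift + m * rel * (1.0 - n / J)
    d = drift + m * (1.0 - rel) * (n / J)
    return b, d


def stationary_by_product(J, m, rel):
    """P_n for n = 0..J from P_n = P_0 * prod_{i<n} b_i / d_{i+1}, then normalized."""
    w = [1.0]
    for n in range(1, J + 1):
        b_prev, _ = rates(n - 1, J, m, rel)
        _, d_n = rates(n, J, m, rel)
        w.append(w[-1] * b_prev / d_n)
    total = sum(w)
    return [v / total for v in w]


def stationary_closed_form(J, m, rel):
    """P_n for n = 0..J from the Gamma-function expression with lambda and vartheta."""
    g = m * (J - 1) / (1.0 - m)
    lam, vth = g * rel, J + g * (1.0 - rel)
    lg = math.lgamma
    out = []
    for n in range(J + 1):
        log_p = (lg(J + 1) - lg(n + 1) - lg(J - n + 1)
                 + lg(n + lam) - lg(lam)
                 + lg(vth - n) - lg(vth - J)
                 + lg(lam + vth - J) - lg(lam + vth))
        out.append(math.exp(log_p))
    return out


J, m, rel = 40, 0.1, 0.05  # illustrative values only
P_prod = stationary_by_product(J, m, rel)
P_closed = stationary_closed_form(J, m, rel)
print(max(abs(p - q) for p, q in zip(P_prod, P_closed)))
```

The two routes should agree to within floating-point rounding, and the closed form should sum to one over $n = 0, \ldots, J$ without any explicit normalization.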

Note that the $k$ dependence in Eq. (10) enters only through $\mu_k$. On substituting
Eq. (10) into Eq. (4), one obtains

$$\langle\phi_n\rangle = \sum_{k=1}^{S_M} F(\mu_k) = S_M \langle F(\mu_k)\rangle = S_M \int d\mu\,\rho(\mu)\,F(\mu). \qquad (13)$$



Here $\rho(\mu)\,d\mu$ is the probability distribution of the mean populations of the
species in the metacommunity and has the form of the familiar Fisher logseries
(in a singularity-free description)

$$\rho(\mu)\,d\mu = \frac{1}{\Gamma(\epsilon)\,\delta^{\epsilon}}\,\exp(-\mu/\delta)\,\mu^{\epsilon-1}\,d\mu, \qquad (14)$$

where $\delta = \frac{x}{1-x}$. Substituting Eq. (14) into the integral in Eq. (13), taking the
limits $S_M \to \infty$ and $\epsilon \to 0$ with $\theta = S_M \epsilon$ approaching a finite value, and
on defining $y = \mu\gamma/(\delta\theta)$, one obtains our central result Eq. (7).
Figure 1: Data on tree species abundances in a 50 hectare plot of tropical
forest on Barro Colorado Island, Panama, taken from Condit et al. The total
number of trees in the dataset is 21457 and the number of distinct species
is 225. The red bars are observed numbers of species binned into log(2)
abundance categories, following Preston's method. The first histogram bar
represents $\langle\phi_1\rangle/2$, the second bar $\langle\phi_1\rangle/2 + \langle\phi_2\rangle/2$,
the third bar $\langle\phi_2\rangle/2 + \langle\phi_3\rangle + \langle\phi_4\rangle/2$,
the fourth bar $\langle\phi_4\rangle/2 + \langle\phi_5\rangle + \langle\phi_6\rangle + \langle\phi_7\rangle + \langle\phi_8\rangle/2$, and so on. The black curve
shows the best fit to a lognormal distribution $\langle\phi_n\rangle = \frac{N}{n}\exp\left(-\frac{(\log_2 n - \log_2 n_0)^2}{2\sigma^2}\right)$
($N = 46.29$, $n_0 = 20.82$ and $\sigma = 2.98$), while the green curve is the best fit to
our analytic expression Eq. (7) ($m = 0.1$, from which one obtains $\theta = 47.226$,
compared to the Hubbell estimates of 0.1 and 50, respectively, and McGill's
best fits of 0.079 and 48.5, respectively).
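
The binning described in the caption can be sketched as follows: abundances that are exact powers of two are split equally between the two neighbouring bars. The function name `preston_bins` and the list-based input convention are assumptions made for illustration; the final bar simply collects whatever remains above the last power of two.

```python
def preston_bins(phi):
    """Bin phi[n] (n = 1..N, with phi[0] unused) into Preston-style log2 classes.

    Values at n = 1, 2, 4, 8, ... are shared half-and-half between the
    two adjacent bars.
    """
    n_max = len(phi) - 1
    bars = [0.5 * phi[1]] if n_max >= 1 else []
    lo = 1
    while lo <= n_max:
        hi = 2 * lo
        bar = 0.5 * phi[lo]                   # half of the lower power of two
        for n in range(lo + 1, min(hi, n_max + 1)):
            bar += phi[n]                     # interior abundances count fully
        if hi <= n_max:
            bar += 0.5 * phi[hi]              # half of the upper power of two
        bars.append(bar)
        lo = hi
    return bars


# Toy input with phi[n] = n for n = 1..8 (phi[0] is a placeholder).
print(preston_bins([0, 1, 2, 3, 4, 5, 6, 7, 8]))
```

Because each abundance class is either counted once or split into two halves, the bars always sum to the total number of species in the input.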